AITopics | unsupervised environment design

Grounding Aleatoric Uncertainty for Unsupervised Environment Design

Neural Information Processing Systems | Dec-25-2025, 09:32:50 GMT

Adaptive curricula in reinforcement learning (RL) have proven effective for producing policies robust to discrepancies between the train and test environment.

ground-truth distribution, grounding aleatoric uncertainty, unsupervised environment design, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.57)

Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design

Neural Information Processing Systems | Dec-24-2025, 08:21:00 GMT

A wide range of reinforcement learning (RL) problems --- including robustness, transfer learning, unsupervised RL, and emergent complexity --- require specifying a distribution of tasks or environments in which a policy will be trained. However, creating a useful distribution of environments is error prone, and takes a significant amount of developer time and effort. We propose Unsupervised Environment Design (UED) as an alternative paradigm, where developers provide environments with unknown parameters, and these parameters are used to automatically produce a distribution over valid, solvable environments. Existing approaches to automatically generating environments suffer from common failure modes: domain randomization cannot generate structure or adapt the difficulty of the environment to the agent's learning progress, and minimax adversarial training leads to worst-case environments that are often unsolvable. To generate structured, solvable environments for our protagonist agent, we introduce a second, antagonist agent that is allied with the environment-generating adversary. The adversary is motivated to generate environments which maximize regret, defined as the difference between the protagonist and antagonist agent's return. We call our technique Protagonist Antagonist Induced Regret Environment Design (PAIRED). Our experiments demonstrate that PAIRED produces a natural curriculum of increasingly complex environments, and PAIRED agents achieve higher zero-shot transfer performance when tested in highly novel environments.
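
The regret signal described in this abstract can be illustrated with a short sketch. It assumes, as a simplification not stated in this listing, that regret is estimated per environment from sampled episode returns as the antagonist's best return minus the protagonist's mean return. The function name `estimate_regret` and the sample returns are hypothetical.

```python
from statistics import mean
from typing import Sequence


def estimate_regret(protagonist_returns: Sequence[float],
                    antagonist_returns: Sequence[float]) -> float:
    """Estimate regret on one environment from sampled episode returns.

    Assumption (not taken from the listing): regret is approximated as the
    antagonist's best return minus the protagonist's average return.
    """
    if not protagonist_returns or not antagonist_returns:
        raise ValueError("both agents need at least one episode return")
    return max(antagonist_returns) - mean(protagonist_returns)


# Hypothetical returns in [0, 1] for two generated environments.
hard_but_solvable = estimate_regret([0.0, 0.5], [1.0, 0.75])
unsolvable = estimate_regret([0.0, 0.0], [0.0, 0.0])

print(hard_but_solvable)  # the antagonist succeeds where the protagonist struggles
print(unsolvable)         # nobody succeeds, so regret is zero
```

Under this estimate an unsolvable environment earns the adversary nothing, because the antagonist fails as well. That is the intuition behind preferring regret to a pure minimax objective.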

emergent complexity and zero-shot transfer, name change, unsupervised environment design, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.76)

TRACED: Transition-aware Regret Approximation with Co-learnability for Environment Design

Cho, Geonwoo, Im, Jaegyun, Lee, Jihwan, Yi, Hojun, Kim, Sejin, Kim, Sundong

arXiv.org Artificial Intelligence | Dec-4-2025

Generalizing deep reinforcement learning agents to unseen environments remains a significant challenge. One promising solution is Unsupervised Environment Design (UED), a co-evolutionary framework in which a teacher adaptively generates tasks with high learning potential, while a student learns a robust policy from this evolving curriculum. Existing UED methods typically measure learning potential via regret, the gap between optimal and current performance, approximated solely by value-function loss. Building on these approaches, we introduce the transition-prediction error as an additional term in our regret approximation. To capture how training on one task affects performance on others, we further propose a lightweight metric called Co-Learnability. By combining these two measures, we present Transition-aware Regret Approximation with Co-learnability for Environment Design (TRACED). Empirical evaluations show that TRACED produces curricula that improve zero-shot generalization over strong baselines across multiple benchmarks. Ablation studies confirm that the transition-prediction error drives rapid complexity ramp-up and that Co-Learnability delivers additional gains when paired with the transition-prediction error. These results demonstrate how refined regret approximation and explicit modeling of task relationships can be leveraged for sample-efficient curriculum design in UED. Project Page: https://geonwoo.me/traced/
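
A minimal sketch of how a transition-prediction error could enter a regret approximation as an additional term. The additive form, the weight `alpha`, the function names, and the task statistics are all assumptions made for illustration. The abstract does not give the exact formula, and Co-Learnability is not modelled here.

```python
from typing import Dict, List, Tuple


def approximate_regret(value_loss: float,
                       transition_error: float,
                       alpha: float = 1.0) -> float:
    """Score a task by value-function loss plus a weighted transition term.

    Assumption: the two terms are combined additively with weight `alpha`;
    this is an illustrative choice, not the formula from the paper.
    """
    return value_loss + alpha * transition_error


def rank_tasks(stats: Dict[str, Tuple[float, float]],
               alpha: float = 1.0) -> List[str]:
    """Return task names ordered from highest to lowest approximate regret."""
    return sorted(stats,
                  key=lambda name: approximate_regret(*stats[name], alpha),
                  reverse=True)


# Hypothetical (value_loss, transition_error) pairs for three tasks.
task_stats = {
    "maze_a": (0.5, 0.25),
    "maze_b": (0.25, 1.0),
    "maze_c": (0.125, 0.0),
}

print(rank_tasks(task_stats, alpha=0.0))  # value-function loss only
print(rank_tasks(task_stats, alpha=1.0))  # with the transition term added
```

With `alpha=0.0` the ranking depends on value-function loss alone. With `alpha=1.0`, the task whose dynamics are predicted worst moves to the front of the curriculum.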

machine learning, natural language, reinforcement learning, (18 more...)

arXiv:2506.19997

Genre: Research Report > New Finding (1.00)

Industry: Education > Curriculum (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

985e9a46e10005356bbaf194249f6856-Paper.pdf

Neural Information Processing Systems | Aug-15-2025, 06:50:06 GMT

adversary, agent, arxiv preprint arxiv, (11 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > New Jersey (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
(3 more...)

Genre: Research Report > New Finding (0.68)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Improving Environment Novelty Quantification for Effective Unsupervised Environment Design

Neural Information Processing Systems | May-27-2025, 21:17:57 GMT

Unsupervised Environment Design (UED) formalizes the problem of autocurricula through interactive training between a teacher agent and a student agent. The teacher generates new training environments with high learning potential, curating an adaptive curriculum that strengthens the student's ability to handle unseen scenarios. Existing UED methods mainly rely on regret, a metric that measures the difference between the agent's optimal and actual performance, to guide curriculum design. Regret-driven methods generate curricula that progressively increase environment complexity for the student but overlook environment novelty -- a critical element for enhancing an agent's generalizability. Measuring environment novelty is especially challenging due to the underspecified nature of environment parameters in UED, and existing approaches face significant limitations.

effective unsupervised environment design, environment novelty quantification, unsupervised environment design, (6 more...)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.61)

Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design

Neural Information Processing Systems | May-27-2025, 06:32:59 GMT

A wide range of reinforcement learning (RL) problems --- including robustness, transfer learning, unsupervised RL, and emergent complexity --- require specifying a distribution of tasks or environments in which a policy will be trained. However, creating a useful distribution of environments is error prone, and takes a significant amount of developer time and effort. We propose Unsupervised Environment Design (UED) as an alternative paradigm, where developers provide environments with unknown parameters, and these parameters are used to automatically produce a distribution over valid, solvable environments. Existing approaches to automatically generating environments suffer from common failure modes: domain randomization cannot generate structure or adapt the difficulty of the environment to the agent's learning progress, and minimax adversarial training leads to worst-case environments that are often unsolvable. To generate structured, solvable environments for our protagonist agent, we introduce a second, antagonist agent that is allied with the environment-generating adversary.

large language model, machine learning, natural language, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.80)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.44)

Review for NeurIPS paper: Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design

Neural Information Processing Systems | Jan-26-2025, 19:55:04 GMT

Additional Feedback: Could you add a specific example or problem that would be easily solved by framing it as UED? I think it would help the paper in general. Is the agent for the Lava environment, which replaces walls with dangerous lava, trained on generated maps with lava instead of walls? Additionally, the paper needs further proofreading. Some minor mistakes I found:

* Line 511: I wouldn't start a proof section by saying "it would be nice to know that..."; that is too informal.
* Line 512: "their" should be "its".
* Line 40: "Section ?", i.e., the referenced section is missing its number.
* Line 138: Shouldn't the function T^M be defined on S^M?
* Line 171: This sentence needs further explanation.
* Line 207: "based on" appears twice.
* Line 209: Figure?

The Broader Impact section, especially the first paragraph, is too speculative. Automating jobs and automated weapons are general problems of the AI field; the section should focus more on the impact of this specific work.

emergent complexity and zero-shot transfer, neurips paper, unsupervised environment design, (1 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.40)

Review for NeurIPS paper: Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design

Neural Information Processing Systems | Jan-26-2025, 19:54:57 GMT

This paper pursues a significant line of enquiry regarding an important topic: automatic, unsupervised environment design. The paper makes algorithmic, theoretical, and empirical contributions. While the reviewers had some concerns about the clarity of the theory and the adequacy of the empirical results, these have been well addressed in the rebuttal. The authors are strongly urged to incorporate all the reviewers' feedback in the final version.

emergent complexity and zero-shot transfer, neurips paper, unsupervised environment design, (1 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.40)

Grounding Aleatoric Uncertainty for Unsupervised Environment Design

Neural Information Processing Systems | Jan-18-2025, 23:57:06 GMT

Adaptive curricula in reinforcement learning (RL) have proven effective for producing policies robust to discrepancies between the train and test environment. Problematically, in partially-observable or stochastic settings, optimal policies may depend on the ground-truth distribution over aleatoric parameters of the environment in the intended deployment setting, while curriculum learning necessarily shifts the training distribution. We formalize this phenomenon as curriculum-induced covariate shift (CICS), and describe how its occurrence in aleatoric parameters can lead to suboptimal policies. Directly sampling these parameters from the ground-truth distribution avoids the issue, but thwarts curriculum learning. We propose SAMPLR, a minimax regret UED method that optimizes the ground-truth utility function, even when the underlying training data is biased due to CICS.

ground-truth distribution, grounding aleatoric uncertainty, unsupervised environment design, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.82)

Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design

Neural Information Processing Systems | Oct-10-2024, 20:44:28 GMT

A wide range of reinforcement learning (RL) problems --- including robustness, transfer learning, unsupervised RL, and emergent complexity --- require specifying a distribution of tasks or environments in which a policy will be trained. However, creating a useful distribution of environments is error prone, and takes a significant amount of developer time and effort. We propose Unsupervised Environment Design (UED) as an alternative paradigm, where developers provide environments with unknown parameters, and these parameters are used to automatically produce a distribution over valid, solvable environments. Existing approaches to automatically generating environments suffer from common failure modes: domain randomization cannot generate structure or adapt the difficulty of the environment to the agent's learning progress, and minimax adversarial training leads to worst-case environments that are often unsolvable. To generate structured, solvable environments for our protagonist agent, we introduce a second, antagonist agent that is allied with the environment-generating adversary.

agent, emergent complexity and zero-shot transfer, unsupervised environment design, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.80)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.44)
